Bug 149

Bug Description:

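This is a starvation bug caused by wait and notify executing in the wrong order: the notify fires before the waiting thread calls wait, so the notification is lost.
More details about this bug are on the POOL-149 JIRA page.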

Interleaving Description:

	
org.apache.commons.pool.impl.GenericObjectPool

t1  t2  private synchronized void allocate() {
            ...
            // Second utilise any spare capacity to create new objects
            for(;;) {
    5           if((!_allocationQueue.isEmpty()) && (_maxActive < 0 || (_numActive + _numInternalProcessing) < _maxActive)) {
                    Latch latch = (Latch) _allocationQueue.removeFirst();
                    latch.setMayCreate(true);
                    _numInternalProcessing++;
                    synchronized (latch) {
    6                   latch.notify();
                    }
                } else {
                    break;
                }
            }
        }

        public Object borrowObject() throws Exception {
            ...
1           _allocationQueue.add(latch);
            ...
            case WHEN_EXHAUSTED_BLOCK:
                try {
2                   synchronized (latch) {
                        if(maxWait <= 0) {
7                           latch.wait();
            ...
        }

        public void invalidateObject(Object obj) throws Exception {
            try {
                if (_factory != null) {
                    _factory.destroyObject(obj);
                }
            } finally {
                synchronized (this) {
    3               _numActive--;
    4               allocate();
                }
            }
        }

Precondition: Pool is empty, _maxActive is N, _numActive is N and _numInternalProcessing is 0.

a) Thread 1 calls borrowObject(). It creates a new latch and adds it to _allocationQueue at 1. _allocationQueue now holds one latch.
b) No object is available, so thread 1 has to wait and proceeds to 2.
c) Before thread 1 obtains the latch lock at 2, a context switch occurs. Thread 2 calls invalidateObject() and decrements the active count at 3. Now _numActive is N-1.
d) Thread 2 calls allocate() at 4 and passes the check at 5, because _allocationQueue contains the latch added in step a) and _numActive (N-1) + _numInternalProcessing (0) < _maxActive (N).
e) Thread 2 obtains the latch lock and notifies the latch at 6. This is the latch added in step a).
f) A context switch occurs. Once thread 2 releases the latch lock, thread 1 obtains it and calls latch.wait() at 7.
Because the wait at 7 starts only after the notify at 6 has already been delivered, the notification is lost and thread 1 waits forever.
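
The interleaving above can be replayed in a single thread by notifying first and waiting second. The sketch below is only an illustration, not the pool's code or its actual patch: LostNotifyDemo, its Latch class, the signalled flag and both wait methods are hypothetical names, and the timeouts exist only so the demo terminates. It contrasts an unconditional wait (as at 7) with a wait guarded by a flag that the notifier sets under the latch lock.

```java
public class LostNotifyDemo {
    /** Hypothetical stand-in for the pool's Latch; not the real class. */
    static final class Latch {
        private boolean signalled; // set by the notifier while holding the latch lock

        /** Plays the role of thread 2 at 6: record the signal, then notify. */
        synchronized void signal() {
            signalled = true;
            notify();
        }

        /**
         * Mirrors step 7: waits without checking any state first, so a notify
         * that already happened is missed. The timeout is an addition so the
         * demo ends. Returns the time spent blocked, in milliseconds.
         */
        synchronized long unguardedWait(long timeoutMs) throws InterruptedException {
            long start = System.nanoTime();
            wait(timeoutMs);
            return (System.nanoTime() - start) / 1_000_000L;
        }

        /**
         * Guarded wait: checks the flag before blocking and again after every
         * wakeup. Returns true if the signal was seen, false on timeout.
         */
        synchronized boolean guardedWait(long timeoutMs) throws InterruptedException {
            long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
            while (!signalled) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false; // never call wait(0), which would block forever
                }
                wait(remainingMs);
            }
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Latch latch = new Latch();
        latch.signal(); // the notify happens first, as in step e)

        long blockedMs = latch.unguardedWait(200); // the wait happens second, as in step f)
        System.out.println("unguarded wait blocked for about " + blockedMs + " ms");

        System.out.println("guarded wait saw the signal: " + latch.guardedWait(200));
    }
}
```

With the guard in place, the order of wait and notify no longer matters, because the waiter consults state that survives the notify instead of relying on the notification itself.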


How To Reproduce:

This bug was reproduced with pool 1.5 and JDK 1.6.0_33.
Execute the following scripts to run the test that reproduces the bug (assuming the pool test project is located at pool_test_home).

Linux:
${pool_test_home}/scripts/149.sh [--monitoroff]
Windows:
%pool_test_home%\scripts\149.bat [--monitoroff]

Example:
Use the monitor to report the bug and terminate the program when the forever wait occurs:
${pool_test_home}/scripts/149.sh

Turn off the monitor:
${pool_test_home}/scripts/149.sh --monitoroff

Option              Function
--monitoroff, -mo   Turn off the monitor, so that it no longer reports bug messages or
                    ends the program when the test runs into the expected concurrency
                    bug (a wait that never returns). The user has to terminate the
                    program manually when this option is set.
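
How the monitor in the test scripts detects the hang is not described here. As a rough illustration only, one way to notice a thread that is stuck in an untimed Object.wait() is to give it a grace period and then inspect its state. HangWatchdog, stuckWaiting and the 500 ms grace period below are hypothetical and are not part of the test project.

```java
public class HangWatchdog {
    /**
     * Waits up to graceMs (must be greater than 0) for the worker to finish,
     * then reports whether it is still alive and parked in an untimed wait.
     */
    static boolean stuckWaiting(Thread worker, long graceMs) throws InterruptedException {
        worker.join(graceMs); // returns early if the worker terminates
        return worker.isAlive() && worker.getState() == Thread.State.WAITING;
    }

    public static void main(String[] args) throws InterruptedException {
        final Object latch = new Object();

        // Models thread 1 at 7: it waits on a latch that nobody will notify.
        Thread worker = new Thread(() -> {
            synchronized (latch) {
                try {
                    latch.wait();
                } catch (InterruptedException e) {
                    // Interrupted by the watchdog below; fall through and exit.
                }
            }
        });
        worker.setDaemon(true);
        worker.start();

        if (stuckWaiting(worker, 500)) {
            System.out.println("worker is stuck in Object.wait(); terminating it");
            worker.interrupt();
        }
    }
}
```

A state check like this cannot tell a lost notification from a legitimately long wait, so the grace period has to be chosen for the test at hand.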